Trimmed Density Ratio Estimation
Density ratio estimation is a vital tool in both the machine learning and statistics communities. However, because the density ratio is unbounded, the estimation procedure can be vulnerable to corrupted data points, which often push the estimated ratio toward infinity. In this paper, we present a robust estimator which automatically identifies and trims outliers. The proposed estimator has a convex formulation, and the global optimum can be obtained via subgradient descent. We analyze the parameter estimation error of this estimator under high-dimensional settings. Experiments are conducted to verify the effectiveness of the estimator.
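To make the "unbounded" point concrete, here is a small illustrative sketch (not taken from the paper). It assumes two zero-mean Gaussians, p = N(0, 1) and q = N(0, 0.5^2), chosen only for illustration; the helper names gaussian_pdf and density_ratio are hypothetical. For this pair the ratio is p(x)/q(x) = 0.5 * exp(1.5 * x^2), so it grows without bound in the tails, where q is small.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Density of N(mean, std^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def density_ratio(x, std_p=1.0, std_q=0.5):
    """Ratio p(x)/q(x) for p = N(0, std_p^2) and q = N(0, std_q^2) (illustrative choice)."""
    return gaussian_pdf(x, 0.0, std_p) / gaussian_pdf(x, 0.0, std_q)

# Evaluate the ratio at points that move further into the tail of q.
xs = np.array([0.0, 1.0, 2.0, 4.0])
ratios = density_ratio(xs)

for x, r in zip(xs, ratios):
    print(f"x = {x:.1f}  p(x)/q(x) = {r:.3e}")
```

A single sample far in the tail therefore carries an enormous ratio value, which is the kind of point a trimmed estimator is meant to discard.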
Reviews: Trimmed Density Ratio Estimation
Summary: This paper proposes a "trimmed" estimator that estimates the ratio of two densities robustly with respect to outliers, assuming an exponential family model. This robustness is important, as density ratios can inherently be very unstable when the denominator is small. The proposed model is based on an optimization problem, motivated by minimizing the KL divergence between the two densities in the ratio, and is made more computationally tractable by re-expressing it as an equivalent saddle-point (max-min) formulation. Similar to the one-class SVM, this formulation explicitly discards a portion (determined by a tuning parameter) of "outlier" samples. The density-ratio estimator is shown to be consistent in two practical settings, one in which the data contains a small portion of explicit outliers and another in which the estimand is intrinsically unstable.
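The trimming idea summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact formulation: it assumes a log-linear ratio model log r(x; theta) = theta . f(x) - log mean_q exp(theta . f(x_q)) with the single feature f(x) = x, and it maximizes mean_i min(log r(x_i; theta), t) - nu * t jointly over theta and a threshold t by plain subgradient ascent (rather than the saddle-point form). The names fit_trimmed, tilted_stats, log_ratio, nu and t_init, and the synthetic data, are all hypothetical choices made for this example.

```python
import numpy as np

def tilted_stats(theta, fq):
    """Log-normalizer and exponentially tilted feature mean over the q sample."""
    s = fq @ theta
    m = s.max()                      # subtract the max for numerical stability
    e = np.exp(s - m)
    log_norm = m + np.log(e.mean())
    mean_fq = (e[:, None] * fq).sum(axis=0) / e.sum()
    return log_norm, mean_fq

def log_ratio(theta, f, fq):
    """log r(x; theta) for the rows of f under the assumed log-linear model."""
    log_norm, _ = tilted_stats(theta, fq)
    return f @ theta - log_norm

def fit_trimmed(fp, fq, nu, steps=5000, lr=1.0, t_init=0.0):
    """Subgradient ascent on mean_i min(log r_i, t) - nu * t (assumed objective)."""
    n_p, d = fp.shape
    theta, t = np.zeros(d), float(t_init)
    for k in range(steps):
        step = lr / np.sqrt(k + 1.0)          # diminishing step size
        log_norm, mean_fq = tilted_stats(theta, fq)
        keep = fp @ theta - log_norm < t      # samples with log r >= t are trimmed
        g_theta = (fp[keep] - mean_fq).sum(axis=0) / n_p
        g_t = np.mean(~keep) - nu             # trimmed fraction minus target nu
        theta = theta + step * g_theta
        t = t + step * g_t
    return theta, t

# Synthetic data (an assumption): q = N(0, 1), p = N(0.5, 1), so the true
# log-ratio is 0.5 * x - 0.125, i.e. theta = 0.5. Ten p-samples are replaced
# by outliers at x = 10.
rng = np.random.default_rng(0)
x_q = rng.normal(0.0, 1.0, size=2000)
x_p = np.concatenate([rng.normal(0.5, 1.0, size=190), np.full(10, 10.0)])
fq, fp = x_q[:, None], x_p[:, None]

# nu = 0 with t fixed at infinity trims nothing, giving an untrimmed baseline.
theta_plain, _ = fit_trimmed(fp, fq, nu=0.0, t_init=np.inf)
theta_trim, t_trim = fit_trimmed(fp, fq, nu=0.05)
print("untrimmed theta:", theta_plain[0], " trimmed theta:", theta_trim[0])
```

In this sketch nu plays the role of the tuning parameter that sets the fraction of discarded samples, in the same spirit as the nu parameter of a one-class SVM: at a stationary point in t, roughly a fraction nu of the p-samples has log r at or above the threshold and no longer contributes to the update of theta.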
Trimmed Density Ratio Estimation
Liu, Song, Takeda, Akiko, Suzuki, Taiji, Fukumizu, Kenji